AITopics | arabic word

arabic word

Americans 'creeped out' as ChatGPT starts inserting Arabic words into responses... before giving strange explanation

Daily Mail - Science & tech | Apr-11-2026, 13:46:15 GMT

large language model, machine learning, natural language, (22 more...)

Daily Mail - Science & tech

Country:

Asia > Middle East > Iran (0.44)
Europe > United Kingdom > England > Greater London > London (0.34)
North America > United States > New Jersey (0.24)
(20 more...)

Genre: Personal > Human Interest (0.34)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

From A for algebra to T for tariffs: Arabic words used in English speech

Al Jazeera | Dec-18-2025, 05:03:58 GMT

Arabic is one of the world's most widely spoken languages, with at least 400 million speakers, including 200 million native speakers and 200 million to 250 million non-native speakers. Modern Standard Arabic (MSA) serves as the formal language for government, legal matters and education, and it is widely used in international and religious contexts. Additionally, more than 25 dialects are spoken, primarily across the Middle East and North Africa. World Arabic Language Day is observed each year on December 18; the date was chosen to mark the day in 1973 on which the UN General Assembly adopted Arabic as one of its six official languages. In the following visual explainer, Al Jazeera lists some of the most common words in today's English language that originated from Arabic or passed through Arabic before reaching English.

algebra, arabic word, english speech, (12 more...)

Al Jazeera

Country:

Europe > Middle East (0.25)
Africa > North Africa (0.25)
Africa > Middle East (0.25)
(10 more...)

Industry:

Law (0.36)
Government (0.35)

Technology: Information Technology > Artificial Intelligence (0.90)

Building and Aligning Comparable Corpora

Saad, Motaz, Langlois, David, Smaili, Kamel

arXiv.org Artificial Intelligence | Aug-5-2025

A comparable corpus is a set of topic-aligned documents in multiple languages that are not necessarily translations of each other. These documents are useful for multilingual natural language processing when no parallel text is available in some domains or languages. In addition, comparable documents are informative because they show what is being said about a topic in different languages. In this paper, we present a method to build comparable corpora from the Wikipedia encyclopedia and the EURONEWS website in English, French and Arabic. We further experiment with a method to automatically align comparable documents using cross-lingual similarity measures. We investigate two such measures: the first is based on a bilingual dictionary, and the second is based on Latent Semantic Indexing (LSI). Experiments on several corpora show that the Cross-Lingual LSI (CL-LSI) measure outperforms the dictionary-based measure. Finally, we collect English and Arabic news documents from the British Broadcasting Corporation (BBC) and from the ALJAZEERA (JSC) news website, respectively. We then use the CL-LSI similarity measure to automatically align comparable documents from BBC and JSC. The evaluation of the alignment shows that CL-LSI is able to align cross-lingual documents not only at the topic level but also at the event level.
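The dictionary-based measure mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English-to-French dictionary, the whitespace tokenizer, and the function names (`translate_bag`, `cosine`, `align`) are all hypothetical and chosen only for illustration. The idea is to map source-language words into the target vocabulary, then compare bag-of-words vectors with cosine similarity and pair each source document with its best-scoring target document.

```python
import math
from collections import Counter

# Hypothetical toy English-to-French dictionary (assumption for illustration);
# a real system would load a full bilingual lexicon.
TOY_DICT = {
    "election": ["élection"],
    "president": ["président"],
    "vote": ["vote", "voter"],
    "football": ["football"],
    "match": ["match"],
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def translate_bag(tokens, dictionary):
    """Map source tokens into the target vocabulary; unknown words are dropped."""
    bag = Counter()
    for token in tokens:
        for translation in dictionary.get(token, []):
            bag[translation] += 1
    return bag


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align(source_docs, target_docs, dictionary):
    """For each source document, return the index of the most similar target document."""
    target_bags = [Counter(tokenize(doc)) for doc in target_docs]
    alignment = []
    for doc in source_docs:
        bag = translate_bag(tokenize(doc), dictionary)
        scores = [cosine(bag, target_bag) for target_bag in target_bags]
        alignment.append(max(range(len(scores)), key=scores.__getitem__))
    return alignment


english = ["president election vote", "football match"]
french = ["match football", "élection président vote"]
alignment = align(english, french, TOY_DICT)
print(alignment)
```

A measure like this depends entirely on dictionary coverage, which may be one reason a latent-space measure such as CL-LSI can perform better, as the abstract reports.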

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2508.02555

Country:

Europe (1.00)
Asia > Middle East (0.93)
North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Media > News (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

ArEEG_Words: Dataset for Envisioned Speech Recognition using EEG for Arabic Words

Darwish, Hazem, Malah, Abdalrahman Al, Jallad, Khloud Al, Ghneim, Nada

arXiv.org Artificial Intelligence | Nov-27-2024

Brain-Computer Interface (BCI) research aims to support communication-impaired patients by translating neural signals into speech. A notable research topic in BCI involves Electroencephalography (EEG) signals, which measure the electrical activity in the brain. While significant advancements have been made in BCI EEG research, a major limitation still exists: the scarcity of publicly available EEG datasets for non-English languages, such as Arabic. To address this gap, we introduce the ArEEG_Words dataset, a novel EEG dataset recorded from 22 participants with a mean age of 22 years (5 female, 17 male) using a 14-channel Emotiv Epoc X device. The participants were asked to avoid anything that affects the nervous system, such as coffee, alcohol, and cigarettes, for 8 hours before recording. They were asked to stay calm in a calm room while imagining one of the 16 Arabic words for 10 seconds. The 16 words are commonly used ones such as up, down, left, and right. A total of 352 EEG recordings were collected, then each recording was divided into multiple 250 ms signals, resulting in a total of 15,360 EEG signals. To the best of our knowledge, ArEEG_Words is the first dataset of its kind in the Arabic EEG domain. Moreover, it is publicly available to researchers, in the hope that it will help fill the gap in Arabic EEG research.
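The step of dividing a recording into 250 ms signals can be sketched as follows. This is only an illustration under stated assumptions: the sampling rate of 128 Hz is assumed (the abstract does not give it), the windows are taken as consecutive and non-overlapping, and the function name `split_into_windows` is hypothetical. The abstract does not say how the authors windowed or trimmed their recordings, so the counts produced here need not match the reported totals.

```python
def split_into_windows(recording, sampling_rate_hz, window_ms):
    """Split a multichannel recording into consecutive, non-overlapping windows.

    `recording` is a list of channels, each a list of samples.
    A trailing partial window, if any, is dropped.
    """
    samples_per_window = sampling_rate_hz * window_ms // 1000
    n_samples = len(recording[0])
    n_windows = n_samples // samples_per_window
    return [
        [
            channel[i * samples_per_window:(i + 1) * samples_per_window]
            for channel in recording
        ]
        for i in range(n_windows)
    ]


SAMPLING_RATE_HZ = 128  # assumption: not stated in the abstract
N_CHANNELS = 14         # from the abstract: 14-channel device
DURATION_S = 10         # from the abstract: 10 seconds per imagined word

# Placeholder signal of zeros standing in for one real recording.
recording = [[0.0] * (SAMPLING_RATE_HZ * DURATION_S) for _ in range(N_CHANNELS)]
windows = split_into_windows(recording, SAMPLING_RATE_HZ, 250)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window keeps all 14 channels, so a downstream classifier would see one small channels-by-samples array per 250 ms segment.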

artificial intelligence, participant, speech recognition, (9 more...)

arXiv.org Artificial Intelligence

2411.18888

Country:

Europe > Portugal (0.04)
Europe > Netherlands (0.04)
Europe > Belgium > Flanders (0.04)
Asia > Middle East > Syria (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.69)
Leisure & Entertainment > Sports > Golf (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.51)

Crowdsourcing Lexical Diversity

Khalilia, Hadi, Otterbacher, Jahna, Bella, Gabor, Noortyani, Rusma, Darma, Shandy, Giunchiglia, Fausto

arXiv.org Artificial Intelligence | Oct-30-2024

Lexical-semantic resources (LSRs), such as online lexicons or wordnets, are fundamental for natural language processing applications. In many languages, however, such resources suffer from quality issues: incorrect entries, incompleteness, but also, the rarely addressed issue of bias towards the English language and Anglo-Saxon culture. Such bias manifests itself in the absence of concepts specific to the language or culture at hand, the presence of foreign (Anglo-Saxon) concepts, as well as in the lack of an explicit indication of untranslatability, also known as cross-lingual lexical gaps, when a term has no equivalent in another language. This paper proposes a novel crowdsourcing methodology for reducing bias in LSRs. Crowd workers compare lexemes from two languages, focusing on domains rich in lexical diversity, such as kinship or food. Our LingoGap crowdsourcing tool facilitates comparisons through microtasks identifying equivalent terms, language-specific terms, and lexical gaps across languages. We validated our method by applying it to two case studies focused on food-related terminology: (1) English and Arabic, and (2) Standard Indonesian and Banjarese. These experiments identified 2,140 lexical gaps in the first case study and 951 in the second. The success of these experiments confirmed the usability of our method and tool for future large-scale lexicon enrichment tasks.

asian low-resour, experiment, lexical gap, (14 more...)

arXiv.org Artificial Intelligence

2410.23133

Country:

Europe > United Kingdom > UK North Sea (0.05)
Atlantic Ocean > North Atlantic Ocean > North Sea > UK North Sea (0.05)
Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
(30 more...)

Genre: Research Report > New Finding (0.66)

Industry: Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Arabic Handwritten Text Line Dataset

Bouchal, Hakim, Belaid, Ahror

arXiv.org Artificial Intelligence | Dec-10-2023

Segmentation of Arabic manuscripts into lines of text and words is an important step in making recognition systems more efficient and accurate. The problem of segmentation into text lines is solved, since there are carefully annotated datasets dedicated to this task. However, to the best of our knowledge, there are no datasets annotating the word positions of Arabic texts. In this paper, we present a new dataset specifically designed for historical Arabic script in which we annotate positions at the word level.

dataset, text line, university, (14 more...)

arXiv.org Artificial Intelligence

2312.07573

Country:

Africa > Middle East > Algeria > Béjaïa Province > Béjaïa (0.06)
North America > United States > New York > Niagara County > Niagara Falls (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.05)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Vision > Handwriting Recognition (0.52)

ARCOQ: Arabic Closest Opposite Questions Dataset

Rizkallah, Sandra, Atiya, Amir F., Shaheen, Samir

arXiv.org Artificial Intelligence | Oct-22-2023

This paper presents a dataset for closest opposite questions in the Arabic language. The dataset is the first of its kind for the Arabic language. It is beneficial for assessing systems on antonymy detection. The structure is similar to that of the Graduate Record Examination (GRE) closest opposite questions dataset for the English language. The introduced dataset consists of 500 questions, each containing a query word for which the closest opposite needs to be determined from among a set of candidate words. Each question is also associated with the correct answer. We publish the dataset publicly and provide standard splits of the dataset into development and test sets. Moreover, the paper provides a benchmark for the performance of different Arabic word embedding models on the introduced dataset.

arXiv.org Artificial Intelligence

2310.14384

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Vietnam > Thái Nguyên Province > Thái Nguyên (0.04)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

ASMDD: Arabic Speech Mispronunciation Detection Dataset

Aly, Salah A., Salah, Abdelrahman, Eraqi, Hesham M.

arXiv.org Artificial Intelligence | Nov-1-2021

The largest dataset of Arabic speech mispronunciation detections in Egyptian dialogues is introduced. The dataset is composed of annotated audio files representing the top 100 words that are most frequently used in the Arabic language, pronounced by 100 Egyptian children (aged between 2 and 8 years old). The dataset is collected and annotated on segmental pronunciation error detections by expert listeners.

arabic speech mispronunciation detection dataset, audio file, dataset, (10 more...)

arXiv.org Artificial Intelligence

2111.01136

Country: Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.05)

Genre: Research Report (0.40)

Industry: Education (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

DiaLex: A Benchmark for Evaluating Multidialectal Arabic Word Embeddings

Abdul-Mageed, Muhammad, Elbassuoni, Shady, Doughman, Jad, Elmadany, AbdelRahim, Nagoudi, El Moatez Billah, Zoughby, Yorgo, Shaher, Ahmad, Gaba, Iskander, Helal, Ahmed, El-Razzaz, Mohammed

arXiv.org Artificial Intelligence | Nov-22-2020

Word embeddings are a core component of modern natural language processing systems, making the ability to thoroughly evaluate them a vital task. We describe DiaLex, a benchmark for intrinsic evaluation of dialectal Arabic word embedding. DiaLex covers five important Arabic dialects: Algerian, Egyptian, Lebanese, Syrian, and Tunisian. Across these dialects, DiaLex provides a testbank for six syntactic and semantic relations, namely male to female, singular to dual, singular to plural, antonym, comparative, and genitive to past tense. DiaLex thus consists of a collection of word pairs representing each of the six relations in each of the five dialects. To demonstrate the utility of DiaLex, we use it to evaluate a set of existing and new Arabic word embeddings that we developed. Our benchmark, evaluation code, and new word embedding models will be publicly available.
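One common way to use relation word pairs for intrinsic evaluation is the vector-offset analogy test: given two pairs from the same relation, (a, b) and (c, d), check whether the word closest to b - a + c is d. The sketch below illustrates that general technique; whether DiaLex's evaluation code follows exactly this procedure is an assumption, and the toy English vectors, word pairs, and function names are hypothetical stand-ins for real dialectal Arabic embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, vectors):
    """Return the word whose vector is closest to b - a + c, excluding a, b, and c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(vectors[word], target))


def analogy_accuracy(pairs, vectors):
    """Score every ordered combination of two distinct pairs from one relation."""
    correct = 0
    total = 0
    for a, b in pairs:
        for c, d in pairs:
            if (a, b) == (c, d):
                continue
            total += 1
            if solve_analogy(a, b, c, vectors) == d:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical 3-dimensional toy embeddings (assumption for illustration only).
TOY_VECTORS = {
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "boy": [0.5, 0.0, 0.2],
    "girl": [0.5, 1.0, 0.2],
}
MALE_FEMALE_PAIRS = [("king", "queen"), ("man", "woman"), ("boy", "girl")]

accuracy = analogy_accuracy(MALE_FEMALE_PAIRS, TOY_VECTORS)
print(accuracy)
```

With a benchmark organized by relation and dialect, a score like this could be reported separately for each relation in each dialect, which makes it easier to see where an embedding model is weak.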

dialect, dialex, evaluation, (16 more...)

arXiv.org Artificial Intelligence

2011.1097

Country:

Europe > Italy > Tuscany > Florence (0.04)
Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.04)
North America > Canada > Quebec > Montreal (0.04)
(8 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Using Artificial Intelligence to Read Arabic Comics - Al-Fanar Media

#artificialintelligence | Mar-19-2018, 23:27:31 GMT

Arabic comics have in recent years grown into a thriving creative movement. BEIRUT--A computer scientist at the American University of Beirut is using artificial intelligence to classify the content of Arabic comics, applying the computer-based science to this cutting-edge art form in the Arab world. Artificial-intelligence specialists are always trying to stretch the capabilities of computer brainpower. If artificial intelligence can be used to play the ancient Chinese board game Go, or the American TV quiz game Jeopardy, then Arabic comics are also fair game. "I try to look for unusual applications for artificial intelligence and machine learning," explained Mariette Awad, the associate professor in the department of electrical and computer engineering at the American University of Beirut who is leading the project.

arabic comic, artificial intelligence, natural language, (11 more...)

#artificialintelligence

Country: Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.68)

Industry: Leisure & Entertainment > Games (0.56)

Technology:

Information Technology > Artificial Intelligence > Applied AI (0.72)
Information Technology > Artificial Intelligence > Natural Language (0.54)
